AUC: A Better Measure than Accuracy in Comparing Learning Algorithms

Authors

  • Charles X. Ling
  • Jin Huang
  • Harry Zhang
Abstract

Predictive accuracy has been widely used as the main criterion for comparing the predictive ability of classification systems (such as C4.5, neural networks, and Naive Bayes). Most of these classifiers also produce probability estimations of the classification, but they are completely ignored in the accuracy measure. This is often taken for granted because both training and testing sets only provide class labels. In this paper we establish rigorously that, even in this setting, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, provides a better measure than accuracy. Our result is quite significant for three reasons. First, we establish, for the first time, rigorous criteria for comparing evaluation measures for learning algorithms. Second, it suggests that AUC should replace accuracy when measuring and comparing classification systems. Third, our result also prompts us to re-evaluate many well-established conclusions based on accuracy in machine learning. For example, it is well accepted in the machine learning community that, in terms of predictive accuracy, Naive Bayes and decision trees are very similar. Using AUC, however, we show experimentally that Naive Bayes is significantly better than the decision-tree learning algorithms.
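As a minimal illustration of the distinction drawn in the abstract (a sketch, not taken from the paper), the Python snippet below compares two hypothetical classifiers on a toy test set. The labels, the scores, the 0.5 decision threshold, and the helper names `accuracy` and `auc` are all assumptions made for this example. AUC is computed here through its pairwise-ranking interpretation: the fraction of positive/negative pairs in which the positive example receives the higher score, with ties counted as one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count each positive/negative pair: 1 for a correct ordering, 0.5 for a tie.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical test set: four positives followed by four negatives.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical probability estimates from two classifiers (illustrative only).
scores_a = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
scores_b = [0.9, 0.8, 0.7, 0.1, 0.6, 0.4, 0.3, 0.2]

print("accuracy:", accuracy(labels, scores_a), accuracy(labels, scores_b))
print("AUC:     ", auc(labels, scores_a), auc(labels, scores_b))
```

On this toy data both classifiers make the same two thresholded errors, so accuracy cannot separate them, while AUC reflects that the second classifier ranks one positive example below every negative.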

Related resources

AUC: a Statistically Consistent and more Discriminating Measure than Accuracy

Predictive accuracy has been used as the main and often only evaluation criterion for the predictive performance of classification learning algorithms. In recent years, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating learning algorithms. In this paper, we prove that AUC is a better measure...

Comparing Naive Bayes, Decision Trees, and SVM with AUC and Accuracy

Predictive accuracy has often been used as the main and often only evaluation criterion for the predictive performance of classification or data mining algorithms. In recent years, the area under the ROC (Receiver Operating Characteristics) curve, or simply AUC, has been proposed as an alternative single-number measure for evaluating performance of learning algorithms. In our previous work, we ...

Combining SVM Classifiers Using Genetic Fuzzy Systems Based on AUC for Gene Expression Data Analysis

Recently, the use of Receiver Operating Characteristic (ROC) Curve and the area under the ROC Curve (AUC) has been receiving much attention as a measure of the performance of machine learning algorithms. In this paper, we propose a SVM classifier fusion model using genetic fuzzy system. Genetic algorithms are applied to tune the optimal fuzzy membership functions. The performance of SVM classif...

Proposing a Novel Cost Sensitive Imbalanced Classification Method based on Hybrid of New Fuzzy Cost Assigning Approaches, Fuzzy Clustering and Evolutionary Algorithms

In this paper, a new hybrid methodology is introduced to design a cost-sensitive fuzzy rule-based classification system. A novel cost metric is proposed based on the combination of three different concepts: Entropy, Gini index and DKM criterion. In order to calculate the effective cost of patterns, a hybrid of fuzzy c-means clustering and particle swarm optimization algorithm is utilized. This ...

Predicting the Level of Cognitive Abilities in Children of Mothers with Gestational Diabetes: A Data Mining Study

Background & Aim: Gestational diabetes can have harmful consequences for children's health. Since the onset of gestational diabetes coincides with brain development, this study is designed to predict developmental growth in children of mothers with gestational diabetes. Methods: In this study, the required data were obtained through investigating the profiles of pregnant women...

Publication date: 2003